Serveur d'exploration MERS

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Efficient sequential and parallel algorithms for planted motif search

Identifieur interne : 001C73 ( Main/Exploration ); précédent : 001C72; suivant : 001C74

Efficient sequential and parallel algorithms for planted motif search

Auteurs : Marius Nicolae [États-Unis] ; Sanguthevar Rajasekaran [États-Unis]

Source :

RBID : PMC:3924400

Descripteurs français

English descriptors

Abstract

Background

Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances.

Results

This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 l-mers to have a common d-neighbor. The program is freely available at http://engr.uconn.edu/~man09004/PMS8/.

Conclusions

We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.


Url:
DOI: 10.1186/1471-2105-15-34
PubMed: 24479443
PubMed Central: 3924400


Affiliations:


Links toward previous steps (curation, corpus...)


Le document en format XML

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Efficient sequential and parallel algorithms for planted motif search</title>
<author>
<name sortKey="Nicolae, Marius" sort="Nicolae, Marius" uniqKey="Nicolae M" first="Marius" last="Nicolae">Marius Nicolae</name>
<affiliation wicri:level="2">
<nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
<placeName>
<region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
<affiliation wicri:level="2">
<nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
<placeName>
<region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">24479443</idno>
<idno type="pmc">3924400</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3924400</idno>
<idno type="RBID">PMC:3924400</idno>
<idno type="doi">10.1186/1471-2105-15-34</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">000950</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000950</idno>
<idno type="wicri:Area/Pmc/Curation">000950</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000950</idno>
<idno type="wicri:Area/Pmc/Checkpoint">001093</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">001093</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:24479443</idno>
<idno type="wicri:Area/PubMed/Corpus">001A68</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001A68</idno>
<idno type="wicri:Area/PubMed/Curation">001A68</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001A68</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001927</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">001927</idno>
<idno type="wicri:Area/Ncbi/Merge">000C74</idno>
<idno type="wicri:Area/Ncbi/Curation">000C74</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000C74</idno>
<idno type="wicri:Area/Main/Merge">001C87</idno>
<idno type="wicri:Area/Main/Curation">001C73</idno>
<idno type="wicri:Area/Main/Exploration">001C73</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Efficient sequential and parallel algorithms for planted motif search</title>
<author>
<name sortKey="Nicolae, Marius" sort="Nicolae, Marius" uniqKey="Nicolae M" first="Marius" last="Nicolae">Marius Nicolae</name>
<affiliation wicri:level="2">
<nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
<placeName>
<region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
<affiliation wicri:level="2">
<nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
<placeName>
<region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Computational Biology (methods)</term>
<term>DNA (chemistry)</term>
<term>DNA (genetics)</term>
<term>Proteins (chemistry)</term>
<term>Proteins (genetics)</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Sequence Analysis, Protein (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>ADN ()</term>
<term>ADN (génétique)</term>
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN ()</term>
<term>Analyse de séquence de protéine ()</term>
<term>Biologie informatique ()</term>
<term>Logiciel</term>
<term>Protéines ()</term>
<term>Protéines (génétique)</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="chemistry" xml:lang="en">
<term>DNA</term>
<term>Proteins</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="genetics" xml:lang="en">
<term>DNA</term>
<term>Proteins</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr">
<term>ADN</term>
<term>Protéines</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Computational Biology</term>
<term>Sequence Analysis, DNA</term>
<term>Sequence Analysis, Protein</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>ADN</term>
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Analyse de séquence de protéine</term>
<term>Biologie informatique</term>
<term>Logiciel</term>
<term>Protéines</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (
<italic>l</italic>
,
<italic>d</italic>
)-motif search or Planted Motif Search (PMS). In PMS we are given two integers
<italic>l</italic>
and
<italic>d</italic>
and
<italic>n</italic>
biological sequences. We want to find all sequences of length
<italic>l</italic>
that appear in each of the input sequences with at most
<italic>d</italic>
mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances.</p>
</sec>
<sec>
<title>Results</title>
<p>This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (
<italic>l</italic>
,
<italic>d</italic>
) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger
<italic>l</italic>
and
<italic>d</italic>
such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3
<italic>l</italic>
-mers to have a common
<italic>d</italic>
-neighbor. The program is freely available at
<ext-link ext-link-type="uri" xlink:href="http://engr.uconn.edu/~man09004/PMS8/">http://engr.uconn.edu/~man09004/PMS8/</ext-link>
.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Pevzner, P" uniqKey="Pevzner P">P Pevzner</name>
</author>
<author>
<name sortKey="Sze, S" uniqKey="Sze S">S Sze</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lanctot, J" uniqKey="Lanctot J">J Lanctot</name>
</author>
<author>
<name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Wang, S" uniqKey="Wang S">S Wang</name>
</author>
<author>
<name sortKey="Zhang, L" uniqKey="Zhang L">L Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yu, Q" uniqKey="Yu Q">Q Yu</name>
</author>
<author>
<name sortKey="Huo, H" uniqKey="Huo H">H Huo</name>
</author>
<author>
<name sortKey="Zhang, Y" uniqKey="Zhang Y">Y Zhang</name>
</author>
<author>
<name sortKey="Guo, H" uniqKey="Guo H">H Guo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Davila, J" uniqKey="Davila J">J Davila</name>
</author>
<author>
<name sortKey="Balla, S" uniqKey="Balla S">S Balla</name>
</author>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author>
<name sortKey="Davila, J" uniqKey="Davila J">J Davila</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ho, E" uniqKey="Ho E">E Ho</name>
</author>
<author>
<name sortKey="Jakubowski, C" uniqKey="Jakubowski C">C Jakubowski</name>
</author>
<author>
<name sortKey="Gunderson, S" uniqKey="Gunderson S">S Gunderson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Desaraju, S" uniqKey="Desaraju S">S Desaraju</name>
</author>
<author>
<name sortKey="Mukkamala, R" uniqKey="Mukkamala R">R Mukkamala</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author>
<name sortKey="Kundeti, V" uniqKey="Kundeti V">V Kundeti</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bandyopadhyay, S" uniqKey="Bandyopadhyay S">S Bandyopadhyay</name>
</author>
<author>
<name sortKey="Sahni, S" uniqKey="Sahni S">S Sahni</name>
</author>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author>
<name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gramm, J" uniqKey="Gramm J">J Gramm</name>
</author>
<author>
<name sortKey="Niedermeier, R" uniqKey="Niedermeier R">R Niedermeier</name>
</author>
<author>
<name sortKey="Rossmanith, P" uniqKey="Rossmanith P">P Rossmanith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Abbas, M" uniqKey="Abbas M">M Abbas</name>
</author>
<author>
<name sortKey="Abouelhoda, M" uniqKey="Abouelhoda M">M Abouelhoda</name>
</author>
<author>
<name sortKey="Bahig, H" uniqKey="Bahig H">H Bahig</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dasari, N" uniqKey="Dasari N">N Dasari</name>
</author>
<author>
<name sortKey="Ranjan, D" uniqKey="Ranjan D">D Ranjan</name>
</author>
<author>
<name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dasari, N" uniqKey="Dasari N">N Dasari</name>
</author>
<author>
<name sortKey="Desh, R" uniqKey="Desh R">R Desh</name>
</author>
<author>
<name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dasari, Ns" uniqKey="Dasari N">NS Dasari</name>
</author>
<author>
<name sortKey="Desh, R" uniqKey="Desh R">R Desh</name>
</author>
<author>
<name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sahoo, B" uniqKey="Sahoo B">B Sahoo</name>
</author>
<author>
<name sortKey="Sourav, R" uniqKey="Sourav R">R Sourav</name>
</author>
<author>
<name sortKey="Ranjan, R" uniqKey="Ranjan R">R Ranjan</name>
</author>
<author>
<name sortKey="Padhy, S" uniqKey="Padhy S">S Padhy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, H" uniqKey="Sun H">H Sun</name>
</author>
<author>
<name sortKey="Low, M" uniqKey="Low M">M Low</name>
</author>
<author>
<name sortKey="Hsu, W" uniqKey="Hsu W">W Hsu</name>
</author>
<author>
<name sortKey="Tan, C" uniqKey="Tan C">C Tan</name>
</author>
<author>
<name sortKey="Rajapakse, J" uniqKey="Rajapakse J">J Rajapakse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, Hq" uniqKey="Sun H">HQ Sun</name>
</author>
<author>
<name sortKey="Low, M" uniqKey="Low M">M Low</name>
</author>
<author>
<name sortKey="Hsu, Wj" uniqKey="Hsu W">WJ Hsu</name>
</author>
<author>
<name sortKey="Rajapakse, J" uniqKey="Rajapakse J">J Rajapakse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Faheem, Hm" uniqKey="Faheem H">HM Faheem</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tompa, M" uniqKey="Tompa M">M Tompa</name>
</author>
<author>
<name sortKey="Li, N" uniqKey="Li N">N Li</name>
</author>
<author>
<name sortKey="Bailey, Tl" uniqKey="Bailey T">TL Bailey</name>
</author>
<author>
<name sortKey="Church, Gm" uniqKey="Church G">GM Church</name>
</author>
<author>
<name sortKey="De Moor, B" uniqKey="De Moor B">B De Moor</name>
</author>
<author>
<name sortKey="Eskin, E" uniqKey="Eskin E">E Eskin</name>
</author>
<author>
<name sortKey="Favorov, Av" uniqKey="Favorov A">AV Favorov</name>
</author>
<author>
<name sortKey="Frith, Mc" uniqKey="Frith M">MC Frith</name>
</author>
<author>
<name sortKey="Fu, Y" uniqKey="Fu Y">Y Fu</name>
</author>
<author>
<name sortKey="Kent, Wj" uniqKey="Kent W">WJ Kent</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Connecticut</li>
</region>
</list>
<tree>
<country name="États-Unis">
<region name="Connecticut">
<name sortKey="Nicolae, Marius" sort="Nicolae, Marius" uniqKey="Nicolae M" first="Marius" last="Nicolae">Marius Nicolae</name>
</region>
<name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
</country>
</tree>
</affiliations>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001C73 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001C73 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:3924400
   |texte=   Efficient sequential and parallel algorithms for planted motif search
}}

Pour générer des pages wiki

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:24479443" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021